Overview

Dataset statistics

Number of variables 12
Number of observations 418
Missing cells 414
Missing cells (%) 8.3%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 39.3 KiB
Average record size in memory 96.3 B
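
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming the percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 bytes; the function names are illustrative and not part of ydata-profiling:

```python
def missing_cell_pct(missing_cells: int, n_variables: int, n_observations: int) -> float:
    """Share of missing cells over the variables x observations grid, in percent."""
    return 100.0 * missing_cells / (n_variables * n_observations)


def avg_record_bytes(total_kib: float, n_observations: int) -> float:
    """Average memory per row in bytes, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_observations


pct = missing_cell_pct(414, 12, 418)
rec = avg_record_bytes(39.3, 418)
print(f"{pct:.1f}% missing cells, {rec:.1f} B per record")
```

Because the total size is itself rounded to one decimal, the per-record value can only be expected to agree with the report after rounding.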

Variable types

Numeric 5
Categorical 4
Text 3

Alerts

Sex is highly overall correlated with Survived High correlation
Survived is highly overall correlated with Sex High correlation
Age has 86 (20.6%) missing values Missing
Cabin has 327 (78.2%) missing values Missing
PassengerId is uniformly distributed Uniform
PassengerId has unique values Unique
Name has unique values Unique
SibSp has 283 (67.7%) zeros Zeros
Parch has 324 (77.5%) zeros Zeros

Reproduction

Analysis started 2024-12-06 17:18:01.022444
Analysis finished 2024-12-06 17:18:06.743415
Duration 5.72 seconds
Software version ydata-profiling v4.12.0

Variables

PassengerId
Real number (ℝ)

Uniform  Unique 

Distinct 418
Distinct (%) 100.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 1100.5
Minimum 892
Maximum 1309
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 3.4 KiB

Quantile statistics

Minimum 892
5-th percentile 912.85
Q1 996.25
median 1100.5
Q3 1204.75
95-th percentile 1288.15
Maximum 1309
Range 417
Interquartile range (IQR) 208.5

Descriptive statistics

Standard deviation 120.81046
Coefficient of variation (CV) 0.10977779
Kurtosis -1.2
Mean 1100.5
Median Absolute Deviation (MAD) 104.5
Skewness 0
Sum 460009
Variance 14595.167
Monotonicity Strictly increasing
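
The figures above are what one would expect if PassengerId is simply the consecutive integers 892 to 1309. A short sketch that recomputes them with the standard library, under that assumption (the report does not state it outright, and the choice of sample variance and linear-interpolation quantiles is inferred from the numbers):

```python
import statistics

# Assumption: the 418 IDs are the consecutive integers 892..1309.
ids = list(range(892, 1310))

mean = statistics.mean(ids)
variance = statistics.variance(ids)   # sample variance (n - 1 denominator)
std = statistics.stdev(ids)
cv = std / mean                       # coefficient of variation
median = statistics.median(ids)
mad = statistics.median(abs(x - median) for x in ids)
# "inclusive" interpolates linearly between the minimum and the maximum
q1, q2, q3 = statistics.quantiles(ids, n=4, method="inclusive")

print(mean, variance, std, cv, mad, q1, q3, sum(ids))
```

Comparing the printed values with the Quantile and Descriptive statistics listed above is a quick way to confirm which conventions the report follows.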
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
892 1 0.2%
1205 1 0.2%
1177 1 0.2%
1176 1 0.2%
1175 1 0.2%
1174 1 0.2%
1173 1 0.2%
1172 1 0.2%
1171 1 0.2%
1170 1 0.2%
Other values (408) 408 97.6%

Value Count Frequency (%)
892 1 0.2%
893 1 0.2%
894 1 0.2%
895 1 0.2%
896 1 0.2%
897 1 0.2%
898 1 0.2%
899 1 0.2%
900 1 0.2%
901 1 0.2%

Value Count Frequency (%)
1309 1 0.2%
1308 1 0.2%
1307 1 0.2%
1306 1 0.2%
1305 1 0.2%
1304 1 0.2%
1303 1 0.2%
1302 1 0.2%
1301 1 0.2%
1300 1 0.2%

Survived
Categorical

High correlation 

Distinct 2
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Memory size 3.4 KiB
0 266
1 152

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 418
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 0
2nd row 1
3rd row 0
4th row 0
5th row 1

Common Values

Value Count Frequency (%)
0 266 63.6%
1 152 36.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
0 266 63.6%
1 152 36.4%

Most occurring characters

Value Count Frequency (%)
0 266 63.6%
1 152 36.4%

Most occurring categories

Value Count Frequency (%)
Decimal Number 418 100.0%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
0 266 63.6%
1 152 36.4%

Most occurring scripts

Value Count Frequency (%)
Common 418 100.0%

Most frequent character per script

Common
Value Count Frequency (%)
0 266 63.6%
1 152 36.4%

Most occurring blocks

Value Count Frequency (%)
ASCII 418 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
0 266 63.6%
1 152 36.4%

Pclass
Categorical

Distinct 3
Distinct (%) 0.7%
Missing 0
Missing (%) 0.0%
Memory size 3.4 KiB
3 218
1 107
2 93

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 418
Distinct characters 3
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 3
2nd row 3
3rd row 2
4th row 3
5th row 3

Common Values

Value Count Frequency (%)
3 218 52.2%
1 107 25.6%
2 93 22.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
3 218 52.2%
1 107 25.6%
2 93 22.2%

Most occurring characters

Value Count Frequency (%)
3 218 52.2%
1 107 25.6%
2 93 22.2%

Most occurring categories

Value Count Frequency (%)
Decimal Number 418 100.0%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
3 218 52.2%
1 107 25.6%
2 93 22.2%

Most occurring scripts

Value Count Frequency (%)
Common 418 100.0%

Most frequent character per script

Common
Value Count Frequency (%)
3 218 52.2%
1 107 25.6%
2 93 22.2%

Most occurring blocks

Value Count Frequency (%)
ASCII 418 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
3 218 52.2%
1 107 25.6%
2 93 22.2%

Name
Text

Unique 

Distinct 418
Distinct (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory size 3.4 KiB

Length

Max length 63
Median length 51
Mean length 27.483254
Min length 13

Characters and Unicode

Total characters 11488
Distinct characters 58
Distinct categories 7
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
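
As an illustration of those character properties, the sketch below counts characters per Unicode general category with Python's `unicodedata` module. The mapping from two-letter codes to long names covers only the categories that appear in this report, and `category_counts` is a hypothetical helper, not the profiler's own code:

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Two names taken from the Sample rows of this variable.
sample = ["Kelly, Mr. James", "Wilkes, Mrs. James (Ellen Needs)"]
for name, count in category_counts(sample).most_common():
    print(name, count)
```

Running the same function over the full Name column would be expected to give a breakdown like the category table further down, assuming the profiler uses the same general categories.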

Unique

Unique 418
Unique (%) 100.0%

Sample

1st row Kelly, Mr. James
2nd row Wilkes, Mrs. James (Ellen Needs)
3rd row Myles, Mr. Thomas Francis
4th row Wirz, Mr. Albert
5th row Hirvonen, Mrs. Alexander (Helga E Lindqvist)
Value Count Frequency (%)
mr 242 14.0%
miss 78 4.5%
mrs 72 4.2%
john 28 1.6%
william 23 1.3%
master 21 1.2%
charles 16 0.9%
joseph 15 0.9%
james 14 0.8%
henry 14 0.8%
Other values (825) 1202 69.7%
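
The word table above lists lowercased tokens without punctuation. The report does not describe its tokenizer, so the following is only a sketch of one plausible approach (lowercase, split on whitespace, strip surrounding punctuation), applied to the five sample names of this variable:

```python
import string
from collections import Counter


def word_counts(values):
    """Lowercase each string, split on whitespace, strip punctuation, count words."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            word = token.strip(string.punctuation)
            if word:
                counts[word] += 1
    return counts


names = [
    "Kelly, Mr. James",
    "Wilkes, Mrs. James (Ellen Needs)",
    "Myles, Mr. Thomas Francis",
    "Wirz, Mr. Albert",
    "Hirvonen, Mrs. Alexander (Helga E Lindqvist)",
]
counts = word_counts(names)
print(counts.most_common(3))
```

Even on five rows the titles dominate, which matches the pattern in the full table where `mr`, `miss` and `mrs` lead.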

Most occurring characters

Value Count Frequency (%)
(space) 1309 11.4%
r 971 8.5%
e 822 7.2%
a 786 6.8%
s 628 5.5%
i 621 5.4%
n 596 5.2%
l 526 4.6%
M 515 4.5%
o 467 4.1%
Other values (48) 4247 37.0%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 7395 64.4%
Uppercase Letter 1738 15.1%
Space Separator 1309 11.4%
Other Punctuation 884 7.7%
Open Punctuation 78 0.7%
Close Punctuation 78 0.7%
Dash Punctuation 6 0.1%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
r 971 13.1%
e 822 11.1%
a 786 10.6%
s 628 8.5%
i 621 8.4%
n 596 8.1%
l 526 7.1%
o 467 6.3%
t 303 4.1%
h 257 3.5%
Other values (16) 1418 19.2%

Uppercase Letter
Value Count Frequency (%)
M 515 29.6%
J 112 6.4%
A 103 5.9%
C 101 5.8%
E 95 5.5%
S 81 4.7%
H 80 4.6%
W 76 4.4%
B 69 4.0%
L 61 3.5%
Other values (14) 445 25.6%

Other Punctuation
Value Count Frequency (%)
. 418 47.3%
, 418 47.3%
" 44 5.0%
' 4 0.5%

Space Separator
Value Count Frequency (%)
(space) 1309 100.0%

Open Punctuation
Value Count Frequency (%)
( 78 100.0%

Close Punctuation
Value Count Frequency (%)
) 78 100.0%

Dash Punctuation
Value Count Frequency (%)
- 6 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 9133 79.5%
Common 2355 20.5%

Most frequent character per script

Latin
Value Count Frequency (%)
r 971 10.6%
e 822 9.0%
a 786 8.6%
s 628 6.9%
i 621 6.8%
n 596 6.5%
l 526 5.8%
M 515 5.6%
o 467 5.1%
t 303 3.3%
Other values (40) 2898 31.7%

Common
Value Count Frequency (%)
(space) 1309 55.6%
. 418 17.7%
, 418 17.7%
( 78 3.3%
) 78 3.3%
" 44 1.9%
- 6 0.3%
' 4 0.2%

Most occurring blocks

Value Count Frequency (%)
ASCII 11488 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
(space) 1309 11.4%
r 971 8.5%
e 822 7.2%
a 786 6.8%
s 628 5.5%
i 621 5.4%
n 596 5.2%
l 526 4.6%
M 515 4.5%
o 467 4.1%
Other values (48) 4247 37.0%

Sex
Categorical

High correlation 

Distinct 2
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Memory size 3.4 KiB
male 266
female 152

Length

Max length 6
Median length 4
Mean length 4.7272727
Min length 4

Characters and Unicode

Total characters 1976
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row male
2nd row female
3rd row male
4th row male
5th row female

Common Values

Value Count Frequency (%)
male 266 63.6%
female 152 36.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
male 266 63.6%
female 152 36.4%

Most occurring characters

Value Count Frequency (%)
e 570 28.8%
m 418 21.2%
a 418 21.2%
l 418 21.2%
f 152 7.7%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 1976 100.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 570 28.8%
m 418 21.2%
a 418 21.2%
l 418 21.2%
f 152 7.7%

Most occurring scripts

Value Count Frequency (%)
Latin 1976 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 570 28.8%
m 418 21.2%
a 418 21.2%
l 418 21.2%
f 152 7.7%

Most occurring blocks

Value Count Frequency (%)
ASCII 1976 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 570 28.8%
m 418 21.2%
a 418 21.2%
l 418 21.2%
f 152 7.7%

Age
Real number (ℝ)

Missing 

Distinct 79
Distinct (%) 23.8%
Missing 86
Missing (%) 20.6%
Infinite 0
Infinite (%) 0.0%
Mean 30.27259
Minimum 0.17
Maximum 76
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 3.4 KiB

Quantile statistics

Minimum 0.17
5-th percentile 8
Q1 21
median 27
Q3 39
95-th percentile 57
Maximum 76
Range 75.83
Interquartile range (IQR) 18

Descriptive statistics

Standard deviation 14.181209
Coefficient of variation (CV) 0.46845047
Kurtosis 0.083783352
Mean 30.27259
Median Absolute Deviation (MAD) 8
Skewness 0.45736129
Sum 10050.5
Variance 201.1067
Monotonicity Not monotonic
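
With 86 of 418 ages missing, it helps to check which rows the summary statistics are based on. The sketch below tests three identities against the figures reported above, on the assumption that the statistics are computed over the non-missing values only:

```python
import math

# Figures copied from the Age summary above.
n_obs, n_missing = 418, 86
total, mean, std = 10050.5, 30.27259, 14.181209
variance, cv = 201.1067, 0.46845047

# Assumption: statistics are computed on the non-missing rows only.
n_valid = n_obs - n_missing

checks = {
    "mean = sum / n_valid": math.isclose(total / n_valid, mean, rel_tol=1e-6),
    "variance = std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "cv = std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
}
for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'MISMATCH'}")
```

The first check only passes when the sum is divided by the 332 non-missing rows rather than all 418, which supports the assumption.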
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
24 17 4.1%
21 17 4.1%
22 16 3.8%
30 15 3.6%
18 13 3.1%
27 12 2.9%
26 12 2.9%
23 11 2.6%
25 11 2.6%
29 10 2.4%
Other values (69) 198 47.4%
(Missing) 86 20.6%

Value Count Frequency (%)
0.17 1 0.2%
0.33 1 0.2%
0.75 1 0.2%
0.83 1 0.2%
0.92 1 0.2%
1 3 0.7%
2 2 0.5%
3 1 0.2%
5 1 0.2%
6 3 0.7%

Value Count Frequency (%)
76 1 0.2%
67 1 0.2%
64 3 0.7%
63 2 0.5%
62 1 0.2%
61 2 0.5%
60.5 1 0.2%
60 3 0.7%
59 1 0.2%
58 1 0.2%

SibSp
Real number (ℝ)

Zeros 

Distinct 7
Distinct (%) 1.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 0.44736842
Minimum 0
Maximum 8
Zeros 283
Zeros (%) 67.7%
Negative 0
Negative (%) 0.0%
Memory size 3.4 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 0
Q3 1
95-th percentile 2
Maximum 8
Range 8
Interquartile range (IQR) 1

Descriptive statistics

Standard deviation 0.89675956
Coefficient of variation (CV) 2.0045214
Kurtosis 26.498712
Mean 0.44736842
Median Absolute Deviation (MAD) 0
Skewness 4.1683366
Sum 187
Variance 0.80417771
Monotonicity Not monotonic
Histogram with fixed size bins (bins=7)
Value Count Frequency (%)
0 283 67.7%
1 110 26.3%
2 14 3.3%
3 4 1.0%
4 4 1.0%
8 2 0.5%
5 1 0.2%

Value Count Frequency (%)
0 283 67.7%
1 110 26.3%
2 14 3.3%
3 4 1.0%
4 4 1.0%
5 1 0.2%
8 2 0.5%

Value Count Frequency (%)
8 2 0.5%
5 1 0.2%
4 4 1.0%
3 4 1.0%
2 14 3.3%
1 110 26.3%
0 283 67.7%

Parch
Real number (ℝ)

Zeros 

Distinct 8
Distinct (%) 1.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 0.3923445
Minimum 0
Maximum 9
Zeros 324
Zeros (%) 77.5%
Negative 0
Negative (%) 0.0%
Memory size 3.4 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 0
Q3 0
95-th percentile 2
Maximum 9
Range 9
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 0.98142888
Coefficient of variation (CV) 2.5014468
Kurtosis 31.412513
Mean 0.3923445
Median Absolute Deviation (MAD) 0
Skewness 4.6544617
Sum 164
Variance 0.96320264
Monotonicity Not monotonic
Histogram with fixed size bins (bins=8)
Value Count Frequency (%)
0 324 77.5%
1 52 12.4%
2 33 7.9%
3 3 0.7%
4 2 0.5%
9 2 0.5%
6 1 0.2%
5 1 0.2%

Value Count Frequency (%)
0 324 77.5%
1 52 12.4%
2 33 7.9%
3 3 0.7%
4 2 0.5%
5 1 0.2%
6 1 0.2%
9 2 0.5%

Value Count Frequency (%)
9 2 0.5%
6 1 0.2%
5 1 0.2%
4 2 0.5%
3 3 0.7%
2 33 7.9%
1 52 12.4%
0 324 77.5%

Ticket
Text

Distinct 363
Distinct (%) 86.8%
Missing 0
Missing (%) 0.0%
Memory size 3.4 KiB

Length

Max length 18
Median length 17
Mean length 6.8755981
Min length 3

Characters and Unicode

Total characters 2874
Distinct characters 32
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 321
Unique (%) 76.8%

Sample

1st row 330911
2nd row 363272
3rd row 240276
4th row 315154
5th row 3101298
Value Count Frequency (%)
pc 32 5.9%
c.a 19 3.5%
ca 8 1.5%
soton/o.q 8 1.5%
sc/paris 7 1.3%
17608 5 0.9%
2 5 0.9%
a/5 5 0.9%
w./c 5 0.9%
f.c.c 4 0.7%
Other values (383) 445 82.0%

Most occurring characters

Value Count Frequency (%)
3 364 12.7%
1 311 10.8%
2 268 9.3%
7 207 7.2%
6 206 7.2%
0 204 7.1%
5 195 6.8%
4 188 6.5%
8 144 5.0%
9 137 4.8%
Other values (22) 650 22.6%

Most occurring categories

Value Count Frequency (%)
Decimal Number 2224 77.4%
Uppercase Letter 349 12.1%
Other Punctuation 172 6.0%
Space Separator 125 4.3%
Lowercase Letter 4 0.1%

Most frequent character per category

Uppercase Letter
Value Count Frequency (%)
C 92 26.4%
P 52 14.9%
A 51 14.6%
O 44 12.6%
S 40 11.5%
T 14 4.0%
N 14 4.0%
Q 12 3.4%
R 7 2.0%
I 7 2.0%
Other values (5) 16 4.6%

Decimal Number
Value Count Frequency (%)
3 364 16.4%
1 311 14.0%
2 268 12.1%
7 207 9.3%
6 206 9.3%
0 204 9.2%
5 195 8.8%
4 188 8.5%
8 144 6.5%
9 137 6.2%

Lowercase Letter
Value Count Frequency (%)
a 1 25.0%
r 1 25.0%
i 1 25.0%
s 1 25.0%

Other Punctuation
Value Count Frequency (%)
. 126 73.3%
/ 46 26.7%

Space Separator
Value Count Frequency (%)
(space) 125 100.0%

Most occurring scripts

Value Count Frequency (%)
Common 2521 87.7%
Latin 353 12.3%

Most frequent character per script

Latin
Value Count Frequency (%)
C 92 26.1%
P 52 14.7%
A 51 14.4%
O 44 12.5%
S 40 11.3%
T 14 4.0%
N 14 4.0%
Q 12 3.4%
R 7 2.0%
I 7 2.0%
Other values (9) 20 5.7%

Common
Value Count Frequency (%)
3 364 14.4%
1 311 12.3%
2 268 10.6%
7 207 8.2%
6 206 8.2%
0 204 8.1%
5 195 7.7%
4 188 7.5%
8 144 5.7%
9 137 5.4%
Other values (3) 297 11.8%

Most occurring blocks

Value Count Frequency (%)
ASCII 2874 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
3 364 12.7%
1 311 10.8%
2 268 9.3%
7 207 7.2%
6 206 7.2%
0 204 7.1%
5 195 6.8%
4 188 6.5%
8 144 5.0%
9 137 4.8%
Other values (22) 650 22.6%

Fare
Real number (ℝ)

Distinct 169
Distinct (%) 40.5%
Missing 1
Missing (%) 0.2%
Infinite 0
Infinite (%) 0.0%
Mean 35.627188
Minimum 0
Maximum 512.3292
Zeros 2
Zeros (%) 0.5%
Negative 0
Negative (%) 0.0%
Memory size 3.4 KiB

Quantile statistics

Minimum 0
5-th percentile 7.2292
Q1 7.8958
median 14.4542
Q3 31.5
95-th percentile 151.55
Maximum 512.3292
Range 512.3292
Interquartile range (IQR) 23.6042

Descriptive statistics

Standard deviation 55.907576
Coefficient of variation (CV) 1.5692391
Kurtosis 17.921595
Mean 35.627188
Median Absolute Deviation (MAD) 6.825
Skewness 3.6872133
Sum 14856.538
Variance 3125.6571
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
7.75 21 5.0%
26 19 4.5%
8.05 17 4.1%
13 17 4.1%
10.5 11 2.6%
7.8958 11 2.6%
7.775 10 2.4%
7.2292 9 2.2%
7.225 9 2.2%
7.8542 8 1.9%
Other values (159) 285 68.2%

Value Count Frequency (%)
0 2 0.5%
3.1708 1 0.2%
6.4375 2 0.5%
6.4958 1 0.2%
6.95 1 0.2%
7 2 0.5%
7.05 2 0.5%
7.225 9 2.2%
7.2292 9 2.2%
7.25 5 1.2%

Value Count Frequency (%)
512.3292 1 0.2%
263 2 0.5%
262.375 5 1.2%
247.5208 1 0.2%
227.525 1 0.2%
221.7792 3 0.7%
211.5 4 1.0%
211.3375 1 0.2%
164.8667 2 0.5%
151.55 2 0.5%

Cabin
Text

Missing 

Distinct 76
Distinct (%) 83.5%
Missing 327
Missing (%) 78.2%
Memory size 3.4 KiB

Length

Max length 15
Median length 3
Mean length 4.0769231
Min length 1

Characters and Unicode

Total characters 371
Distinct characters 18
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 62
Unique (%) 68.1%

Sample

1st row B45
2nd row E31
3rd row B57 B59 B63 B66
4th row B36
5th row A21
Value Count Frequency (%)
f 4 3.4%
b57 3 2.5%
b63 3 2.5%
b66 3 2.5%
b59 3 2.5%
c27 2 1.7%
e46 2 1.7%
c6 2 1.7%
c78 2 1.7%
b45 2 1.7%
Other values (80) 92 78.0%

Most occurring characters

Value Count Frequency (%)
C 43 11.6%
5 34 9.2%
1 33 8.9%
B 32 8.6%
6 30 8.1%
3 28 7.5%
(space) 27 7.3%
2 25 6.7%
4 21 5.7%
7 15 4.0%
Other values (8) 83 22.4%

Most occurring categories

Value Count Frequency (%)
Decimal Number 226 60.9%
Uppercase Letter 118 31.8%
Space Separator 27 7.3%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
5 34 15.0%
1 33 14.6%
6 30 13.3%
3 28 12.4%
2 25 11.1%
4 21 9.3%
7 15 6.6%
8 14 6.2%
0 14 6.2%
9 12 5.3%

Uppercase Letter
Value Count Frequency (%)
C 43 36.4%
B 32 27.1%
D 14 11.9%
E 12 10.2%
F 8 6.8%
A 7 5.9%
G 2 1.7%

Space Separator
Value Count Frequency (%)
(space) 27 100.0%

Most occurring scripts

Value Count Frequency (%)
Common 253 68.2%
Latin 118 31.8%

Most frequent character per script

Common
Value Count Frequency (%)
5 34 13.4%
1 33 13.0%
6 30 11.9%
3 28 11.1%
(space) 27 10.7%
2 25 9.9%
4 21 8.3%
7 15 5.9%
8 14 5.5%
0 14 5.5%

Latin
Value Count Frequency (%)
C 43 36.4%
B 32 27.1%
D 14 11.9%
E 12 10.2%
F 8 6.8%
A 7 5.9%
G 2 1.7%

Most occurring blocks

Value Count Frequency (%)
ASCII 371 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
C 43 11.6%
5 34 9.2%
1 33 8.9%
B 32 8.6%
6 30 8.1%
3 28 7.5%
(space) 27 7.3%
2 25 6.7%
4 21 5.7%
7 15 4.0%
Other values (8) 83 22.4%

Embarked
Categorical

Distinct 3
Distinct (%) 0.7%
Missing 0
Missing (%) 0.0%
Memory size 3.4 KiB
S 270
C 102
Q 46

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 418
Distinct characters 3
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Q
2nd row S
3rd row Q
4th row S
5th row S

Common Values

Value Count Frequency (%)
S 270 64.6%
C 102 24.4%
Q 46 11.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
s 270 64.6%
c 102 24.4%
q 46 11.0%

Most occurring characters

Value Count Frequency (%)
S 270 64.6%
C 102 24.4%
Q 46 11.0%

Most occurring categories

Value Count Frequency (%)
Uppercase Letter 418 100.0%

Most frequent character per category

Uppercase Letter
Value Count Frequency (%)
S 270 64.6%
C 102 24.4%
Q 46 11.0%

Most occurring scripts

Value Count Frequency (%)
Latin 418 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
S 270 64.6%
C 102 24.4%
Q 46 11.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 418 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
S 270 64.6%
C 102 24.4%
Q 46 11.0%

Interactions

[Figures: 25 interaction plots, not reproduced in this text version]

Correlations

Age Embarked Fare Parch PassengerId Pclass Sex SibSp Survived
Age 1.000 0.135 0.315 -0.130 -0.019 0.349 0.000 -0.015 0.000
Embarked 0.135 1.000 0.240 0.113 0.060 0.308 0.109 0.101 0.109
Fare 0.315 0.240 1.000 0.378 0.020 0.475 0.154 0.441 0.154
Parch -0.130 0.113 0.378 1.000 0.051 0.000 0.213 0.412 0.213
PassengerId -0.019 0.060 0.020 0.051 1.000 0.054 0.000 -0.010 0.000
Pclass 0.349 0.308 0.475 0.000 0.054 1.000 0.106 0.113 0.106
Sex 0.000 0.109 0.154 0.213 0.000 0.106 1.000 0.136 0.995
SibSp -0.015 0.101 0.441 0.412 -0.010 0.113 0.136 1.000 0.136
Survived 0.000 0.109 0.154 0.213 0.000 0.106 0.995 0.136 1.000
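
The report does not say how each coefficient was computed. For two categorical columns, one common choice is a bias-corrected Cramér's V with Yates' continuity correction on 2x2 tables, and that measure stays slightly below 1 even for a perfect association. The sketch below implements it in plain Python; the crosstab is hypothetical and assumes every male row has Survived = 0 and every female row has Survived = 1, which the matching 266/152 counts suggest but the report does not confirm:

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows.

    Applies Yates' continuity correction when the table is 2x2.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)
    yates = 0.5 if (r == 2 and k == 2) else 0.0

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            diff -= min(yates, diff)  # never correct past the expected value
            chi2 += diff ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return 1.0 if denom == 0 else math.sqrt(phi2_corr / denom)


# Hypothetical crosstab: rows = Sex (male, female), columns = Survived (0, 1).
table = [[266, 0], [0, 152]]
v = cramers_v_corrected(table)
print(round(v, 3))
```

If the printed value matches the Sex/Survived entry in the matrix above, that is consistent with Survived being a pure function of Sex in this file, though it does not prove the report used this exact measure.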

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
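
Nullity correlation can be sketched as the Pearson correlation between the missing-value indicators of two columns. The example below uses Age and Cabin from the last ten rows of the Sample section; treating the heatmap as plain Pearson on 0/1 indicators is an assumption, and `pearson` is a hand-written helper rather than the profiler's code:

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Age and Cabin for rows 408..417 of the Sample section; None marks NaN.
age = [None, 3.0, None, 37.0, 28.0, None, 39.0, 38.5, None, None]
cabin = [None, None, None, "C78", None, None, "C105", None, None, None]

# 1 where the value is missing, 0 where it is present.
age_null = [int(v is None) for v in age]
cabin_null = [int(v is None) for v in cabin]

r = pearson(age_null, cabin_null)
print(round(r, 3))
```

A positive value means the two columns tend to be missing on the same rows. Ten rows are far too few to say anything about the full dataset, so this only illustrates the calculation.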

Sample

PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
0 892 0 3 Kelly, Mr. James male 34.5 0 0 330911 7.8292 NaN Q
1 893 1 3 Wilkes, Mrs. James (Ellen Needs) female 47.0 1 0 363272 7.0000 NaN S
2 894 0 2 Myles, Mr. Thomas Francis male 62.0 0 0 240276 9.6875 NaN Q
3 895 0 3 Wirz, Mr. Albert male 27.0 0 0 315154 8.6625 NaN S
4 896 1 3 Hirvonen, Mrs. Alexander (Helga E Lindqvist) female 22.0 1 1 3101298 12.2875 NaN S
5 897 0 3 Svensson, Mr. Johan Cervin male 14.0 0 0 7538 9.2250 NaN S
6 898 1 3 Connolly, Miss. Kate female 30.0 0 0 330972 7.6292 NaN Q
7 899 0 2 Caldwell, Mr. Albert Francis male 26.0 1 1 248738 29.0000 NaN S
8 900 1 3 Abrahim, Mrs. Joseph (Sophie Halaut Easu) female 18.0 0 0 2657 7.2292 NaN C
9 901 0 3 Davies, Mr. John Samuel male 21.0 2 0 A/4 48871 24.1500 NaN S
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
408 1300 1 3 Riordan, Miss. Johanna Hannah"" female NaN 0 0 334915 7.7208 NaN Q
409 1301 1 3 Peacock, Miss. Treasteall female 3.0 1 1 SOTON/O.Q. 3101315 13.7750 NaN S
410 1302 1 3 Naughton, Miss. Hannah female NaN 0 0 365237 7.7500 NaN Q
411 1303 1 1 Minahan, Mrs. William Edward (Lillian E Thorpe) female 37.0 1 0 19928 90.0000 C78 Q
412 1304 1 3 Henriksson, Miss. Jenny Lovisa female 28.0 0 0 347086 7.7750 NaN S
413 1305 0 3 Spector, Mr. Woolf male NaN 0 0 A.5. 3236 8.0500 NaN S
414 1306 1 1 Oliva y Ocana, Dona. Fermina female 39.0 0 0 PC 17758 108.9000 C105 C
415 1307 0 3 Saether, Mr. Simon Sivertsen male 38.5 0 0 SOTON/O.Q. 3101262 7.2500 NaN S
416 1308 0 3 Ware, Mr. Frederick male NaN 0 0 359309 8.0500 NaN S
417 1309 0 3 Peter, Master. Michael J male NaN 1 1 2668 22.3583 NaN C